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We study folding in 16-monomer heteropolymers on the square lattice. For a given sequence, ther- 
modynamic properties and stability of the native state are unique. However, the kinetics of folding 
depends on the model of dynamics adopted for the time evolution of the system. We consider three 
such models: Rouse-like dynamics with either single monomer moves or with single and double 
monomer moves, and the 'slithering snake' dynamics. Usually, the snake dynamics has poorer fold- 
ing properties compared to the Rouse-like dynamics, but examples of opposite behavior can also be 
found. This behavior relates to which conformations act as local energy minima when their stability 
is checked against the moves of a particular dynamics. A characteristic temperature related to the 
combined probability, Pl, to stay in the non-native minima during folding coincides with the tem- 
perature of the fastest folding. Studies of Pl yield an easy numerical way to determine conditions 
of the optimal folding. 

PACS numbers: 87.15.By, 87.10.+e 
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Proteins that are found in nature fold rapidly to 
their native states when physiological conditions are 
restoredua. Random sequences of .amino acids, on the 
other hand, may take forever to folda or they may have a 
non-compact ground state. Lattice models have provided 
insights into the key problems of folding kinetics like the 
transition through the compactification stagea and exis- 
tence and characterization of the folding funnelETtj. For 
the lattice models, the dynamics needs to be declared and 
there are various ways to define the single step moves for 
a given Hamiltonian. The time evolution is then imple- 
mented by performing a Monte Carlo process or by using 
the Mast^euuationD. One usually adopts the Rouse-like 
dynamics! 1 ™ *l in which there are two kinds of motions: 
single- and double-monomer moves. The single- monomer 
move consists of end-flip and corner moves while the 
double-monomer move consists of the crankshaft-like ro- 
tation. Typically, one declares a certain proportion in 
which the two kinds of motions are attempted. For in- 
stance, one attempts single monomer moves with proba- 
bility CL2 and the double,monomer moves with probabil- 
ity 0.8til. Chan and Dillo have also studied an expanded 
set of moves in which a rotation of large segments of the 
polymer was also allowed. 

The thermodynamic stability of a lattice heteropoly- 
mer in its native state depends only on the energy spec- 
trum, i.e. on the Hamiltonian, but not on the dynamics 
itself. The thermodynamic stability may be character- 
ized by the folding temperature, Tf, defined by the value 
of temperature, T, at which the equilibrium probability 
to fold, Po, is |. The kinetic propensity to fold may be 
characterized by T m i n - the temperature at which the 
folding process is the fastest, or by the glass transition 
temperature, T g , below which the kinetics becomes so 
slow that folding is kinetically unlikelycJ. The definition 
of T g relies on the cutoff value of the characteristic fold- 
ing time whereas T m i n is defined uniquely and it seems 
preferable to use the latter. Below T min , an onset of 



the glassy effects takes place and the value of T m .;„ de- 
pends on the dynamics. Here, we demonstrate that this 
dependence is significant. In particular, we show that a 
sequence may not even fold if the dynamics is not chosen 
adequately. Furthermore, we demonstrate that T m i n is 
related to the combined probability, P L , for the sequence 
to be in non-native local energy minima before finding 
the native state. Specifically, we show that T m i n coin- 
cides with the temperature at which P L crosses h . Notice 
that whether a given conformation is a local energy min- 
imum or not depends on the set of the dynamics moves 
used because existence of stability against these moves is 
what defines a minimum. 

We study several heteropolymers on the square lattice 
and we consider three models of the dynamics: 1) stan- 
dard Rouse-like dynamics (RD) with the single and dou- 
ble moves applied with the proportions mentioned above, 
2) single monomer dynamics (RD1), and 3) the 'slithering 
snake' dynamics introduced by-Wall and Mandel (WMD 
- for Wall-Mandel dynamics)!!.! The latter model im- 
itates snake-like displacement that characterizes motion 
of polymers in dense solutions and has been introduced as 
a numerical implementation of the reptation model pro- 
posed by De GennesEj. It is also related to diffusion of a 
mobile chain through an ordered array of immobile ob- 
stacles. Briefly, the dynamics involves choosing randomly 
one end of the chain and then attempting to advance it to 
a new neighboring lattice site with the remaining chain 
following along the previous contour. On the square lat- 
tice, both head and tail can attempt to move to three 
new destinations each. The motion is not allowed if the 
end site would be occupied by the chain after the whole 
slithering displacement was accomplished. An example 
of the snake move is represented in Figure la. This kind 
of the dynamics is knownEj to lead to the well defined 
i 1 / 4 law for the mean square displacement of the cen- 
tral bead, and then to a Rouse-like t 1 / 2 law, before the 
asymptotic diffusive behavior is reached. Here, t denotes 
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time. Our motivation to consider the snake dynamics 
is mostly formal - we would like to discuss a dynamics 
which is clearly distinct from RD. It is conceivable, how- 
ever, that there may exist sufficiently dense environments 
in which reptation may turn out to represent the protein 
motion better. 
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FIG. 1. a) An example of the snake move in WMD: start- 
ing from the conformation at the bottom the chain can make 
a motion either to two adjacent conformations on the top-left 
and the top-right of the figure. The end bead of the chain 
finds a new position on the lattice and all the other beads 
'slither' forward along the previous contour by one lattice con- 
stant. Note that the conformation at the bottom is also the 
native conformation of sequence R. b) Non-ergodicity effects 
in 16-mer chain: the conformation on the left is not accessible 
from any other conformation for the move set present in RD, 
the same thing happens in WMD to the conformation on the 
right. 

Another issue iSpthat of ergodicity. As pointed out by 
e.g. Chan and Dillcl, the move set of Rouse-like dynamics 
is not ergodic for 16-mer chains on the square lattice. 
We can see this by considering the conformation shown 
on the left of Figure lb. This conformation can never 
be reached by the RD moves but it can by the WMD 
moves. The conformation on the right of Figure lb shows 
a behavior which is just the opposite. For longer chains, 
non-ergodicity may become significant. 

We consider three 16-monomer sequences on the square 
lattice. Two of them, R and DSKS', have couplings gen- 
erated with the Gaussian probability distribution and 



their values are listed in Ref. 6. Sequence R is con- 
structed by the rank-ordering technique that assigns the 
most strongly attractive couplings to the native contacts 
in a target structured. This sequence has been found to 
be a good folder under _thc RDB. Sequence DSKS', first 
studied by Dinner et alJa, is a bad folder within the same 
dynamics. We demonstrate that, under WMD, both se- 
quences become bad folders. We then consider an HP- 
sequencea, which we shall encode as HP 2, since it has 
two 2 polar and 14 hydrophobic beads, for which WMD 
provides better folding than RD. This sequence has the 
structure H-H-H-H-P-H-H-H-H-H-H-P-H-H-H-H and the 
corresponding native state is shown in Figure 2a. 
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FIG. 2. Native conformations of selected HP sequences 
with 2 polar beads. The filled and open circles denote hy- 
drophobic (H) and polar (P) amino acids respectively. 2a) 
corresponds to sequence HP2. 

In lattice models, an energy of a sequence in a confor- 
mation is given by 



E = 



E 

Kj 



Bij A(i - j) 



(1) 



where A(i — j) denotes presence of a contact between 
monomers i and j, i.e. A (i — j) is 1 if indices i and j 
belong to beads that are nearest neighbors on a lattice 
but are not neighbors along the sequence. Otherwise, 
A(i — j) is set equal to 0. Bjj are the corresponding- 
contact energies. Basically, in Gaussian model, -Bjj's 
have Gaussian values with a mean shifted by negative 



2 



numbers to provide compactness in the ground state. In 
the HP modelo there are only three types of contacts 
and their energies are —1, 0, for the H-H, H-P and P-P 
pairs respectively. The values of Tf are obtained by an 
exact enumeration of all conformations and are equal to 
1.15, 0.195, and 0.164 for sequences R, DSKS', and HP2 
respectively. 

Our Monte Carlo simulations have been doge in a way 
that satisfies the detailed balance conditionsEI and were 
devised along the lines described in ref.Q. For polymers, 
satisfying these conditions is non-trivial because each 
conformation has its own number, A, of allowed moves 
that the conformation can make. Thus the propensities 
to make a move in a time unit vary from conformation to 
conformation and the effective " activities" of the confor- 
mations need to be matched. This can be accomplished 
by first determining the maximum value of A, A max . 
For the 16-monomer chain on the square lattice, A max is 
equal to 18, if the dynamics corresponds to RD or RD1, 
and 6 in the case of WMD. We associate a single time unit 
with the conformations in which A—A max . This means 
that each allowed move is being attempted always with 
probability l/A max . For a conformation with A allowed 
moves, probability to attempt any move is then A/A max 
and probability not to do any attempt is 1 — A/A max . 
The attempted moves are then accepted or rejected as 
in the standard Metropolis procedure. This description 
holds for RD1 and WMD. In the case of RD, probability 
to do a single monomer move is additionally reduced by 
the factor of 0.2 and to do an allowed double monomer 
move - by 0.8. The time used in Figures 3-5 is equal to 
the total number of the Monte Carlo attempts divided 
by A max . This scheme not only establishes the detailed 
balance conditionsEI but it also uses less CPU compared 
to a process in which moves arc attempted with disregard 
to whether they are allowed or not. 

We have carried out Monte Carlo simulations to deter- 
mine the temperature dependence of the median folding 
time, tf id, for the three sequences and using the three 
models of the dynamics. Figure 3 and 4 show the results 
for R and DSKS' and Figure 5 is for HP2. For each tem- 
perature the median folding time is determined based on 
200 independent runs starting at random conformations 
(the cut off is set at value which is significantly above the 
lowest folding time). The data points shown are averaged 
over 5 to 10 simulations, each corresponding to 200 tra- 
jectories. The figures show that t/ D id depends on the 
dynamics in a sensitive way but the temperature depen- 
dence of t fold generally has a U-shape with a pronounced 
minimum (the minimum becomes rather broad only for 
DSKS' with the RD1). Sequence R is a good folder within 
RD1 and especially RD but it becomes a bad folder un- 
der WMD: T min is significantly above Tf. DSKS' and 
HP2, on the other hand, are both bad folders for all of 
the types of dynamics considered here. However, it is in- 
teresting to point out that for HP2 the WMD dynamics 
yields a T m i n which is comparable to that generated by 
RD and RD1 and the folding times themselves are sig- 



nificantly reduced under WMD. This situation, however, 
is uncommon: in most cases that we studied, including 
other HP sequences, except for those shown in Figures 
2a - 2d, WMD tends to make the folding poorer. This is 
because a snake-like move usually breaks many contacts. 
Sequence HP2 is also uncommon in another respect: it 
folds better under RD1 than under RD which suggests 
that for this sequence the crankshaft moves are much less 
favorable than the single moves. 
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FIG. 3. The median folding time versus temperature for 
sequence R for the three kinds of the dynamics: RD, RD1 
and WMD. The inset shows the temperature dependence of 
Pl ■ The arrow associated with Tf indicates the folding tem- 
perature. The other arrows indicate temperatures at which 
Pl crosses 0.5 for each type of the dynamics. Note that they 
are very close to Tmin- 

The geometry of the native conformation is also an 
important factor. Generally an HP sequence folds bet- 
ter under WMD than under RD if it has very few po- 
lar monomers, but this is not always the case. Figure 2 
shows the native states for several HP sequences with 2 
monomers of the P-type. The first four of them are fast 
under WMD but the last one (Figure 2e) is very slow. 
We have checked that moving out of the native state of 
Figure 2e by WMD involves a large energy barrier. 

It is commonly accepted that folding is a motion that 
takes place in a rugged energy landscapes, which involves 
crossing many energy barriers. The barriers arise due to 
the presence of local energy minima (LM) in the system. 
The role of the minima can be assessed by determining 
Pl - the probability to encounter LM's on the way to 
folding. This probability is defined as the fraction of time 
spent in the LM's relative to the full folding time. This 
quantity depends on the dynamics explicitly: not only 
through the definition of what conformation constitutes 
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a minimum, which could be analyzed by studying the en- 
ergy spectrum, but also through the fact that the associ- 
ated weights are not necessarily Boltzmannian. The local 
minima themselves play a crucial role in some schemes 
to coarse grain the description of the folding process!! 
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FIG. 4. Same as Figure 3 but for sequence DSKS'. 
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FIG. 5. Same as Figure 3 but for sequence HP2. 



There are two kinds of conformations that are LM's: 
V-shaped, if the energy of the the conformation is lower 
than the energies of all conformations that are imme- 
diately accessible from it, and U-shaped, if one cannot 



reach states which are lower in energy but some of the 
allowed moves leave the energy unchanged. For the 16- 
mer model there are 802075 possible conformations and 
only a small fraction, /, of these makes minima. With 
the RD dynamics, there are 9103 LM's for sequence R 
out of which 2024 are U-shaped. Each of the U-shaped 
minima consists of several states thus the total number 
of states involved in the U-shaped minima is 4893. The 
total number of states which are minima of whatever 
kind is then 11972 which makes about f=1.5% of the 
phase space. The corresponding numbers for the other 
sequences and other types of the dynamics are shown in 
Table I. Checking whether a conformation found on a 
Monte Carlo trajectory is a local energy minimum or not 
enhances the CPU by about 50%. An incorporation of 
the detailed balance conditions already involves an enu- 
meration of the possible moves but checking for the min- 
ima requires an additional determination of the resulting 
energies. Furthermore, checking wether the minimum is 
U-shaped requires probing possible trajectories within a 
cutoff number of steps. 



Sequence 


RD 


RD1 


WMD 


R 


11972 (4 893) 


16 425 (8 253) 


149443 


DSKS' 


12 373 (5 202) 


15 851 (7 846) 


150 835 


HP 2 


12 606 (5 024) 


19142 (10 093) 


103 363 



TAB. 1. The total number of conformations that are LM's 
for each of the sequences for the three kinds of dynamics. The 
numbers in the brackets correspond to conformations which 
are in the U-shaped LM's. In case of WMD there are no 
U-shaped LM's. 

For the WMD dynamics, the minima cut out an order 
of magnitude larger portion of the phase space which al- 
ters the energy landscape dramatically. Notice also that 
when a chain makes a snake move the set of new contacts 
usually has no overlap with the preceding set of contacts. 
In the Rouse-like dynamics, on the other hand, the con- 
formations that immediately connect to the native state 
have sets of contacts which are overlapping to a large 
extent. 

The small fraction of the phase space that corresponds 
to LM's freezes the kinetics out at T=0. Thus at T=0 we 
get Pl = 1 whereas at high temperatures Pl is of order 
f - as shown in Figure 6 for sequence R and DSKS'. 
There is then a crossover temperature Tl at which Pl 
crosses | . Notice that there is no such crossover behavior 
for the quantity P^ which corresponds to Pl with the 
weights calculated at equilibrium - through the partition 
function. The reason is that at T=0 it is only the native 
state that has the occupation of 1. Instead, P^ has a 
maximum, which in the case of the sequence R, under 
RD coincides with T m i n . In other cases, like for DSKS' 
as shown in Figure 6, the peak in P^ q is substantially 
displaced away from T m i n towards lower temperatures. 
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FIG. 6. Plots of Po, Pl, and P^ q vs. temperature for 
sequence R (top) and DSKS' (bottom) under RD. Po and 
P^ q are obtained through the exact enumeration of states. 
The data points for Pl that are marked by the open circles 
are averaged over 5 MC trajectories whereas those marked by 
black circles - over 50 trajectories. 

The plots of P L vs. T are shown in the insets of Fig- 
ures 3,4, and 5. The data points shown are averaged over 
50 trajectories. The striking observation is that T L ap- 
pears to coincide with T m „ for any sequence and for any 
model of the dynamics that we studied. In other words, 
folding turns out to be the most favorable at a temper- 
ature when half of its time the sequence spends in the 
local minima during folding. Thus T L is a measure of 
temperature below which kinetic trapping in the minima 
becomes substantial. T L then conveys the same physics 
as contained in T m i n . We have observed that T L is much 
easier to calculate than T m j n because Pl, at any T, con- 
verges to a well determined value quite fast: it becomes 
reliable already after several runs - as demonstrated in 
Figure 6. For good folders, the temperature at which P^ q 
has a maximum is expected to be somewhere around Tf 
because the maximum signifies an onset of a substantial 
equilibrium occupation of the native state and the folding 
funnel dominates the energy landscape. For bad folders, 
on the other hand, we find that there is essentially no 
relationship between the temperature of the maximum 
and Tf because the non-native minima are blocking for- 
mation of the native funnel and the position of the max- 
imum is dominated by the nature of the dynamics. The 
relationship between T L and T m i n is well defined both for 
bad and good folders because the definitions of the two 



temperatures are anchored to the dynamics. 

In summary, we have shown that the kinetics of folding 
strongly depends on the details of the dynamics. Wc 
have also provided a simplified method to determine the 
temperature of the fastest folding. The method is based 
on monitoring the combined occupation of the non-native 
local energy minima. We have also indicated that, for 
good folders, the folding temperature can be estimated 
by studying equilibrium occupation of the minima. Thus 
the essential characteristics of well folding sequences can 
be obtained by focusing on the energy minima instead 
of on the native state itself. This may offer approximate 
ways to study longer sequences. 

We thank for discussions with M. S. Li. This research 
has been supported by KBN (Grant number 2P03B-025- 
13). 
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